ABSTRACT

We present the results from an automated search for damped Lyα (DLA) systems in the quasar spectra
of Data Release 1 from the Sloan Digital Sky Survey (SDSS-DR1). At z ≈ 2.5, this homogeneous dataset
has greater statistical significance than the previous two decades of research. We derive a statistical
sample of 71 damped Lyα systems (> 50 previously unpublished) at z > 2.1 and measure HI column
densities directly from the SDSS spectra. The number of DLA systems per unit redshift is consistent with
previous measurements and we expect our survey has > 95% completeness. We examine the cosmological
baryonic mass density of neutral gas Ω_g inferred from the damped Lyα systems from the SDSS-DR1
survey and a combined sample drawn from the literature. Contrary to previous results, the Ω_g values
do not require a significant correction from Lyman limit systems at any redshift. We also find that the
Ω_g values for the SDSS-DR1 sample do not decline at high redshift and the combined sample shows
a (statistically insignificant) decrease only at z > 4. Future data releases from SDSS will provide the
definitive survey of DLA systems at z ≈ 2.5 and will significantly reduce the uncertainty in Ω_g at higher
redshift.

Subject headings: Galaxies: Evolution, Galaxies: Intergalactic Medium, Galaxies: Quasars: Absorption
Lines

1. INTRODUCTION 



It has now been two decades since the inception of surveys
for high redshift galaxies through the signature of
damped Lyα (DLA) absorption in the spectra of background
quasars (Wolfe et al. 1986). Owing to large neutral
hydrogen column densities N(HI), these absorption
lines exhibit large rest equivalent widths (W_λ > 10 Å) and
show the Lorentzian wings characteristic of quantum
mechanical line damping. Through dedicated surveys of high
and low redshift quasars with optical and ultraviolet
telescopes, over 300 damped Lyα systems have been identified.
These galaxies span redshifts z = 0 (the Milky Way, LMC,
SMC) to z = 5.5, where the opacity of the Lyα forest
precludes detection (Songaila & Cowie 2002).
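The link between a large N(HI) and an equivalent width above 10 Å can be checked with a short calculation. The sketch below assumes the line is entirely in the damped (Lorentzian wing) regime, where the rest equivalent width has the closed form W_ν = 2 sqrt(π a) with a = N σ_0 f γ / (4π²). The atomic constants and the function name are our own choices for illustration and are not taken from the paper.

```python
import math

# Physical constants (cgs) and Lyα atomic data; values assumed for this sketch
C_CM_S = 2.99792458e10         # speed of light [cm/s]
SIGMA_CLASSICAL = 0.02654      # pi e^2 / (m_e c) [cm^2 Hz]
LYA_WAVELENGTH_CM = 1215.67e-8 # Lyα rest wavelength [cm]
LYA_F = 0.4164                 # Lyα oscillator strength
LYA_GAMMA = 6.265e8            # Lyα damping constant [1/s]


def damped_rest_equivalent_width(n_hi):
    """Rest equivalent width in Å for a pure damping-wing Lyα profile.

    n_hi is the HI column density in cm^-2. The optical depth is
    approximated as tau = a / dnu^2, which integrates analytically.
    """
    a = n_hi * SIGMA_CLASSICAL * LYA_F * LYA_GAMMA / (4.0 * math.pi ** 2)
    w_nu = 2.0 * math.sqrt(math.pi * a)               # equivalent width in Hz
    w_lambda_cm = LYA_WAVELENGTH_CM ** 2 / C_CM_S * w_nu
    return w_lambda_cm * 1.0e8                        # cm -> Å


if __name__ == "__main__":
    for n_hi in (2e20, 1e21):
        print(n_hi, round(damped_rest_equivalent_width(n_hi), 2))
```

Under these assumptions a system at the DLA threshold, N(HI) = 2 × 10^20 cm^-2, has a rest equivalent width slightly above 10 Å, and the width grows as the square root of N(HI).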

Statistics of the DLA systems impact a wide range of
topics in modern cosmology, galaxy formation, and physics.
These include studies on the chemical enrichment of the
universe in neutral gas (Pettini et al. 1994; Prochaska
et al. 2003b), nucleosynthetic processes (Lu et al. 1996;
Prochaska, Howk, & Wolfe 2003), galactic velocity fields
(Prochaska & Wolfe 1997), the molecular and dust content
of young galaxies (Vladilo 1998; Ledoux, Srianand, &
Petitjean 2003), star formation rates (Wolfe, Prochaska,
& Gawiser 2003), and even constraints on temporal evolution
of the fine-structure constant (Webb et al. 2001).
Perhaps the most fundamental measurement from DLA
surveys, however, is the evolution of the cosmological baryonic
mass density in neutral gas Ω_g (Storrie-Lombardi and
Wolfe 2000; Rao & Turnshek 2000; Péroux et al. 2003,
hereafter PMSI03). Because the DLA systems dominate
the mass density of neutral gas from z = 0 to at least
z = 3.5, a census of these absorption systems directly
determines Ω_g. These measurements express global evolution
in the gas which feeds star formation (Pei & Fall 1995;
Mathlin et al. 2001) and are an important constraint for
models of hierarchical galaxy formation (e.g. Somerville,
Primack, & Faber 2001; Nagamine, Springel, & Hernquist
2004a).



The most recent compilation of damped Lyα systems
surveyed in a 'blind', statistical manner combines the
efforts of observing programs using over 10 telescopes, 10
unique instruments, and the data reduction and analysis
of ≈ 10 different observers (PMSI03). In short, the results
are derived from a heterogeneous sample of quasar spectra
drawn from heterogeneous quasar surveys. While considerable
care has been taken to collate these studies into an
unbiased analysis, it is difficult to assess the completeness
and potential selection biases of the current sample. These
issues are particularly important when one aims to address
the impact of effects like dust obscuration (Ostriker
& Heisler 1984; Fall & Pei 1993; Ellison et al. 2001).

In this paper we present the first results in a large survey
for damped Lyα systems drawn from a homogeneous
dataset of high z quasars with well-defined selection criteria.
Specifically, we survey the quasar spectra from Data
Release 1 of the Sloan Digital Sky Survey (SDSS-DR1),
restricting our search to SDSS-DR1 quasars with Petrosian
magnitude r' < 19.5 mag. The DR1 sample alone (the first
of five data releases from SDSS) offers a survey comparable
to, although not strictly independent from, the efforts
of 20 years of work. We introduce algorithms to automatically
identify DLA candidates in the fluxed (i.e. non-normalized)
quasar spectra and perform Voigt profile analyses
to confirm and analyze the DLA sample. This survey
was motivated by a search for 'metal-strong' DLA systems
like the z = 2.626 damped Lyα system toward FJ0812+32
(Prochaska, Howk, & Wolfe 2003). A discussion of the
'metal-strong' survey will be presented in a future paper
(Herbert-Fort et al. 2004, in preparation).

This paper is organized as follows. In § 2, we present the 
quasar sample and discuss the automatic DLA candidate 
detection. In § 3, we present the Voigt profile fits to the 
full sample. We present a statistical analysis in § 4 and a 
summary and concluding remarks are given in § 5. 
2. QUASAR SAMPLE AND DLA CANDIDATES

The quasar sample was drawn from Data Release 1 of
the Sloan Digital Sky Survey to a limiting Petrosian
magnitude of r' = 19.5 mag. This criterion was chosen
primarily to facilitate follow-up observations with 10m-class
telescopes and it includes > 60% of all SDSS-DR1 quasars
at z > 2. With rare exception, the fiber-fed SDSS
spectrograph provides FWHM ≈ 150 km s^-1 spectra of each
quasar for the wavelength range λ ≈ 3800-9200 Å. All
of the spectra were reduced using the SDSS spectrophotometric
pipeline (Burles & Schlegel 2004) and were
retrieved from the SDSS data archive^1 (Abazajian et al.
2003).

The first step of a damped Lyα survey is to establish the
redshift pathlength available to the discovery of DLA systems.
The minimum starting wavelength of 3800 Å corresponds
to z = 2.12 for the Lyα transition and this sets the
lowest redshift accessible to this survey. For each quasar,
however, we define a unique starting redshift z_start by
identifying the first pixel where the median SNR over 20 pixels
exceeds 4. This criterion was chosen to (1) minimize
the likelihood of identifying noise features as DLA systems;
(2) achieve a high completeness limit; (3) account for
the presence of Lyman limit absorption. Consistent with
previous studies, the ending redshift z_end corresponds to
3000 km s^-1 blueward of Lyα emission. This criterion limits
the probability of identifying DLA systems associated
with the quasar, which may bias the analysis.
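The two redshift limits can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the survey code: the function names are hypothetical, the 20-pixel window is taken to start at the pixel being tested (the text does not say how the window is placed), and the velocity offset is applied with the simple non-relativistic factor (1 − v/c).

```python
import statistics

LYA_ANGSTROM = 1215.67   # Lyα rest wavelength [Å]
C_KMS = 299792.458       # speed of light [km/s]


def start_redshift(wave, snr, window=20, threshold=4.0):
    """Lyα redshift of the first pixel whose `window`-pixel median SNR exceeds `threshold`.

    wave: observed wavelengths in Å, increasing; snr: per-pixel signal-to-noise.
    Returns None if no window satisfies the criterion.
    """
    for i in range(len(wave) - window + 1):
        if statistics.median(snr[i:i + window]) > threshold:
            return wave[i] / LYA_ANGSTROM - 1.0
    return None


def end_redshift(z_em, offset_kms=3000.0):
    """Redshift lying `offset_kms` blueward of Lyα emission at z_em."""
    return (1.0 + z_em) * (1.0 - offset_kms / C_KMS) - 1.0


if __name__ == "__main__":
    # First quasar of Table 1 has z_em = 2.292 and a tabulated z_end of 2.259
    print(round(end_redshift(2.292), 3))
```

Applied to the emission redshifts in Table 1, `end_redshift` gives values that agree with the tabulated z_end entries for quasars without BAL activity.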

Special consideration is given to quasar spectra which
show significant absorption lines at the quasar emission
redshift (e.g. C IV, O VI). In previous studies, Broad
Absorption Line (BAL) quasars have been removed from the
analysis primarily to prevent confusion with intrinsic O VI
and/or N V absorption. We take a less conservative approach
here. We visually inspected the 1252 quasars with
z_em > 2.1 and r' < 19.5 to identify quasars with associated
absorption. In these cases, we limit the DLA search
to 100 Å redward of O VI emission and 100 Å blueward of
Lyα emission. However, if BAL contamination is determined
to be too severe, the quasar is rejected from further
analysis.

The majority of previous DLA surveys relied on low
resolution 'discovery' spectra to first identify DLA candidates.
Follow-up observations were then made of these
candidates to confirm DLA systems and measure their
N(HI) values. A tremendous advantage of the SDSS spectra
is that they have sufficient resolution to both readily
identify DLA candidates and measure their N(HI) values.
DLA candidates were identified using an algorithm tuned
to the characteristics of the damped Lyα profile, in particular
its wide, saturated core. Our DLA-searching algorithm
first determines a characteristic signal-to-noise ratio
(SNR_qso) for each quasar spectrum. Ideally, we calculate
this value blueward of Lyα emission, specifically by taking
the median SNR of 150 pixels lying 51-200 pixels blueward
of Lyα emission. If the Lyα emission peak is less than
200 pixels from the start of the spectrum, then we calculate
SNR_qso from the median SNR of the 150 pixels starting 50
pixels redward of Lyα emission. We then define a quantity
n_1 = SNR_qso/2.5 restricted to have a value between 1 and
2. At each pixel j in the spectrum, we then measure the
fraction of pixels with SNR_i < n_1 in a window 6(1 + z_j)
pixels wide, where z_j = λ_j/1215.67 Å − 1. This window was
chosen to match the width of the core of a DLA profile with
SDSS spectral resolution and sampling. Importantly (for
fiber data), the algorithm is relatively insensitive to the
effects of poor sky-subtraction. Furthermore, we stress that
continuum fitting is unnecessary; the algorithm works directly
on the fluxed data because it focuses primarily on
the core of the damped Lyα profile.

^1 http://www.sdss.org
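The per-pixel statistic described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the window is assumed to be centred on pixel j (the text gives only its width), and the Lyα emission pixel is assumed to be known in advance.

```python
import statistics

LYA_ANGSTROM = 1215.67  # Lyα rest wavelength [Å]


def snr_threshold(snr, lya_emission_pixel):
    """Threshold n_1 = SNR_qso / 2.5, clipped to the range [1, 2].

    SNR_qso is the median SNR of the 150 pixels lying 51-200 pixels blueward
    of Lyα emission, or of the 150 pixels starting 50 pixels redward when the
    emission peak is within 200 pixels of the start of the spectrum.
    """
    p = lya_emission_pixel
    if p >= 200:
        reference = snr[p - 200:p - 50]
    else:
        reference = snr[p + 50:p + 200]
    snr_qso = statistics.median(reference)
    return min(2.0, max(1.0, snr_qso / 2.5))


def low_snr_fraction(wave, snr, n1):
    """Fraction of pixels with SNR < n1 in a window 6(1 + z_j) pixels wide at each pixel j."""
    fractions = []
    for j, lam in enumerate(wave):
        z_j = lam / LYA_ANGSTROM - 1.0
        width = max(1, round(6.0 * (1.0 + z_j)))
        lo = max(0, j - width // 2)          # assumed: window centred on pixel j
        hi = min(len(snr), lo + width)
        window = snr[lo:hi]
        fractions.append(sum(1 for s in window if s < n1) / len(window))
    return fractions
```

Because the statistic counts pixels below an SNR threshold rather than comparing flux to a continuum, no continuum model enters the calculation, which is the property the text emphasizes.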

This algorithm was developed through tests on both
simulated spectra with resolution and SNR comparable
to SDSS data and also on a sub-set of SDSS spectra with
known DLA systems. Our tests indicate that DLA candidates
correspond to windows where the fraction of pixels
with SNR < n_1 exceeds 60%. We recorded all regions
satisfying this criterion and reduced them to individual
candidates by grouping within 2000 km s^-1 bins. In a sample
of 1000 trials on simulated spectra with random N(HI)
and redshift, we recover 100% of all DLA systems with
log N(HI) > 20.4 and all but ≈ 5% of the DLA systems
with N(HI) ≈ 2 × 10^20 cm^-2. The algorithm is conservative
in that it triggers many false positive detections, the
majority of which are BAL features or blended Lyα clouds.
With custom software, it is easy to visually identify and
account for these cases.
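The thresholding and grouping step can be illustrated with a short sketch. The 60% threshold and the 2000 km s^-1 bin come from the text; the grouping rule used here (each group is anchored to its lowest-redshift member and summarized by its mean redshift) and the function name are assumptions made for the example.

```python
C_KMS = 299792.458  # speed of light [km/s]


def group_candidates(redshifts, fractions, min_fraction=0.6, bin_kms=2000.0):
    """Reduce flagged pixels to individual DLA candidate redshifts.

    redshifts: Lyα redshift of each pixel; fractions: low-SNR fraction at each pixel.
    Pixels with fraction > min_fraction are flagged, then merged when they lie
    within bin_kms of the first member of the current group.
    """
    flagged = sorted(z for z, f in zip(redshifts, fractions) if f > min_fraction)
    groups = []
    for z in flagged:
        if groups:
            z_ref = groups[-1][0]
            dv = C_KMS * (z - z_ref) / (1.0 + z_ref)
            if dv <= bin_kms:
                groups[-1].append(z)
                continue
        groups.append([z])
    return [sum(g) / len(g) for g in groups]


if __name__ == "__main__":
    z_pix = [2.300, 2.302, 2.304, 2.500, 2.501]
    frac = [0.70, 0.90, 0.80, 0.65, 0.50]
    print(group_candidates(z_pix, frac))
```

In this toy input the three flagged pixels near z = 2.30 collapse to one candidate and the single flagged pixel at z = 2.50 forms a second.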

Table 1 lists the full sample of SDSS-DR1 quasars. The
columns give the name, z_em, z_start, z_end, a flag for BAL
characteristics, and redshifts of DLA candidates including
the false positive detections.

3. N(HI) ANALYSIS

The automated algorithm described in the previous section
triggered 286 DLA candidates. We visually inspected
the full set of candidates and identified ≈ 100 as obvious
false positive detections. For the remainder of the systems,
we fit a local continuum and a Voigt profile with
FWHM = 2 pixels to the data. The Voigt profile fits to
the DLAs quoted in this paper are centered on the redshift
determined by associated metal-line absorption. Because
the metal lines are narrow, these redshifts are determined
precisely. As emphasized by Prochaska et al. (2003a), the



Table 1
SDSS QUASAR SAMPLE

Name                  z_em    z_start  z_end   f_BAL^a  z_candidate
J094454.24-004330.3   2.292   2.150    2.259   …
J095253.84+011422.1   3.024   2.154    2.984   …        2.204, 2.381
J100412.88+001257.5   2.239   2.156    2.207   …
J100553.34+001927.1   2.501   2.155    2.466   …
J101014.25-001015.2   2.190   2.143    2.158   …
J101748.90-003124.5   2.283   2.156    2.250   …
J101859.96-005420.2   2.183   2.147    2.151   …
J102606.67+011459.0   2.266   2.157    2.233   …
J102636.96+001530.2   2.178   …        …       …
J102650.39+010518.3   2.274   2.177    2.192   1

^a 0=No BAL activity; 1=Modest BAL activity, included in analysis; 2=Strong …
Note. — [The complete version of this table is in the electronic edition.]
Fig. 1. — Lyα profiles of the 71 damped Lyα systems comprising the full statistical sample from the SDSS Data Release 1. The dotted line
traces the assumed continuum of the quasar and the green solid line is a Voigt profile corresponding to the N(HI) values given in Table 2.
All plots have angstroms along the x-axis and flux (f_λ × 10^17 cgs) along the y-axis.
N(HI) analysis is dominated by systematic error associated
with continuum fitting and line blending of coincident
Lyα clouds. The statistical error based on a minimization
routine would be unrealistically low and largely
meaningless. Therefore, we perform a visual fit to the
data and report a conservative systematic error which we
believe encompasses an interval in N(HI) corresponding to
a 95% c.l. For a majority of the profiles, this corresponds
to ±0.15 dex, independent of N(HI) value.

The Lyα fits for all Lyα profiles satisfying the N(HI) >
2 × 10^20 cm^-2 criterion are plotted in Figure 1. Overplotted
in each figure are the best fit and our assessment of the
error corresponding to a 95% c.l. interval. Table 2 summarizes
the absorption redshift, lists the N(HI) value and
estimated uncertainty, and gives a brief comment for each
profile (e.g. difficult continuum, severe line-blending, poor
SNR).

For ≈ 10 of the DLA systems in the SDSS-DR1 sample,
we have acquired higher resolution spectroscopy (FWHM ≈
30 km s^-1) of the Lyα profile with the Echellette Spectrometer
and Imager (Sheinis et al. 2002) on the Keck II
telescope. The ESI spectra suffer less from line blending
and also allow for a more accurate determination of
the quasar continuum. Furthermore, several of these systems
were observed in previous studies. We find that
our N(HI) values agree with all previous measurements
to within 0.15 dex with no systematic offset. Therefore,
we are confident in the N(HI) values reported here.




Fig. 2. — Redshift path density g(z) as a function of redshift
for (i) the SDSS-DR1 survey (dotted red line); (ii) the PMSI03
compilation (dashed blue line); and (iii) the combined surveys (solid black line).



4. ANALYSIS 

4.1. g(z) and n(z)

A simple yet meaningful description of the statistical
significance of any quasar absorption line survey is given
by the redshift path density g(z) (e.g. Lanzetta et al.
1991). This quantity corresponds to the number of quasars
searched at a given redshift for the presence of a particular
absorption feature, e.g., a damped Lyα system. We
have constructed g(z) for the SDSS-DR1 sample by
implementing the starting and ending redshifts listed in
Table 1. Figure 2 presents g(z) for (i) the SDSS-DR1 sample
(red dotted lines); (ii) the PMSI03 compilation (dashed
blue lines); and (iii) the combined surveys taking into
account overlap between the two samples (black solid line).
It is evident from Figure 2 that the SDSS-DR1 sample has
greatest statistical impact at z = 2-3.2. With only ≈ 7%
of the projected SDSS database, the SDSS-DR1 exceeds
the redshift path density of the previous two decades of
research at z ≈ 2.5. Although the SDSS-DR1 systems
have only a modest contribution at z > 3, the projected
10× increase in g(z) for the full SDSS sample promises a
major impact for DLA studies to at least z = 4.
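As a concrete illustration of the definition above, g(z) is a count of sightlines whose search window [z_start, z_end] contains z. The sketch below uses five (z_start, z_end) pairs read from the rows of Table 1; the function names are hypothetical.

```python
def redshift_path_density(z, sightlines):
    """g(z): number of quasars whose DLA search window contains redshift z."""
    return sum(1 for z_start, z_end in sightlines if z_start <= z <= z_end)


def total_redshift_path(sightlines):
    """Total redshift path: the sum of (z_end - z_start) over all sightlines."""
    return sum(z_end - z_start for z_start, z_end in sightlines)


# (z_start, z_end) for five quasars listed in Table 1
SIGHTLINES = [
    (2.150, 2.259),
    (2.154, 2.984),
    (2.156, 2.207),
    (2.143, 2.158),
    (2.156, 2.250),
]

if __name__ == "__main__":
    for z in (2.2, 2.5):
        print(z, redshift_path_density(z, SIGHTLINES))
    print(round(total_redshift_path(SIGHTLINES), 3))
```

For these five sightlines, four cover z = 2.2 but only one reaches z = 2.5, which shows how g(z) falls toward higher redshift as fewer quasars have a sufficiently high emission redshift.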




Fig. 3. — Incidence of damped Lyα systems per unit redshift n(z)
as a function of redshift for the SDSS-DR1 (red points) and total
samples (black points). The vertical error bars reflect 1σ uncertainty
assuming Poissonian statistics and the horizontal bars indicate the
redshift interval. The dotted blue line is the fit to n(z) from
Storrie-Lombardi and Wolfe (2000): n(z) = 0.055(1 + z)^1.11.



Granted a determination of g(z), it is trivial to calculate
the number density of DLA systems per unit redshift
n(z). Integrating n(z) over several redshift bins, we derive
the results presented in Figure 3 for the SDSS-DR1 sample
(red) and the combined surveys (black). Overplotted
on the figure is the power-law fit to n(z) from
Storrie-Lombardi and Wolfe (2000): n(z) = 0.055(1 + z)^1.11.
The SDSS-DR1 sample is in good agreement with previous
analysis; this bolsters the assertion that our analysis
has > 95% completeness. The combined data sample
has uncertainties in n(z) of 10-15% for Δz = 0.5
intervals. With future SDSS data releases, we will measure
n(z) in Δz = 0.25 intervals to better than 5% uncertainty.
This measurement provides an important constraint
on the HI cross-section of high redshift galaxies
(e.g. Nagamine, Springel, & Hernquist 2004a) and thereby
models of galaxy formation with CDM cosmology (e.g.
Kauffmann 1996; Ma et al. 1997). Table 3 lists the n(z)
values for the total sample for the redshift bins shown in
Figure 3.
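The step from g(z) to n(z) is a ratio of counts to redshift path. The sketch below assumes the Poisson error is sqrt(N)/Δz; the function name is hypothetical, and the path length Δz ≈ 275.1 used in the example is not quoted in the paper but back-computed from the first 'Total' row of Table 3 (N = 52, n(z) = 0.189).

```python
import math


def dla_incidence(n_dla, delta_z):
    """Incidence per unit redshift and its 1-sigma Poisson uncertainty.

    n_dla: number of DLA systems in the bin.
    delta_z: total redshift path in the bin (the integral of g(z) over the bin).
    """
    n_z = n_dla / delta_z
    sigma = math.sqrt(n_dla) / delta_z
    return n_z, sigma


if __name__ == "__main__":
    n_z, sigma = dla_incidence(52, 275.1)  # delta_z is an assumed, back-computed value
    print(round(n_z, 3), round(sigma, 3))
```

With these inputs the sketch returns n(z) ≈ 0.189 with an uncertainty near 0.026, matching the corresponding Table 3 entry; the fractional error is 1/sqrt(N), so a 5% measurement requires roughly 400 systems per bin.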
Table 2
SDSS DLA SAMPLE

Name                r'     z_em   z_abs         log N(HI)     Comment
J003501.88-091817   19.10  2.420  2.338         20.55         continuum
J012230.62+133437   19.32  3.010  2.349         …
J012747.80+140543   18.73  2.490  2.442         20.30         continuum, blending
J013901.40-082443   18.68  3.020  2.677         20.70
J021129.16+124110   18.87  2.950  2.595         20.60
J022554.85+005451   18.97  2.970  2.714         21.00         blending
J023408.97-075107   18.97  2.540  2.319         …
J025512.29-071107   19.43  2.820  2.612         …
J025518.58+004847   19.27  3.990  3.254         20.65         continuum, blending
                                  3.915         21.40         continuum, blending
J033854.77-000520   18.78  3.050  2.229         20.90         continuum, blending, poor SNR
J074500.47+341731   19.25  3.710  2.995, 3.228  20.45, …
J075545.61+405643   19.23  2.350  2.301         20.35         blending
J080137.68+472528   19.42  3.280  3.223         20.70         continuum, blending
J081435.18+502946   18.34  3.880  3.708         21.35
J081618.99+482328   19.17  3.570  2.701         20.40         continuum
                                  3.436         20.80         continuum
J082535.19+512706   18.36  3.510  3.318         20.85
J082612.54+451355   19.23  3.820  3.460         21.35         blending, poor SNR
J084039.27+525504   19.34  3.090  2.862         20.30         continuum
J084407.29+515311   19.44  3.210  2.775         21.45         continuum
J090301.24+535315   18.56  2.440  2.291         21.40
J091223.02+562128   19.09  3.000  2.889         20.55         continuum
J091955.42+551205   19.02  2.510  2.387         20.45         continuum
J092014.47+022803   19.21  2.940  2.351         20.70         continuum
J093657.14+581118   19.03  2.540  2.275         20.35         blending
J094008.44+023209   19.41  3.220  2.565         20.70
J094759.41+632803   19.17  2.620  2.496         20.65
J100428.43+001825   18.50  3.050  2.540         21.00
                                  2.685         21.35
J104252.32+011736   18.69  2.440  2.267         20.75         continuum, poor SNR
J104543.55+654321   19.10  2.970  2.458         20.85         continuum
J110749.14-011230   19.22  3.400  2.940         20.80         blending
J113441.22+671751   18.59  2.960  2.815         20.40         blending
J114220.26-001216   18.91  2.490  2.258         …

Note. — The asymmetric uncertainties on log N(HI) are not legible in this copy; as stated in § 3, most correspond to ±0.15 dex. Entries marked … are likewise illegible.
Table 2 - cont

Name                r'     z_em   z_abs         log N(HI)     Comment
J120144.36+011611   17.53  3.230  2.684         21.00         blending
J120847.64+004321   19.19  2.720  2.608         20.45
J121238.41+675920   18.68  2.570  2.221         20.40
                                  2.264         20.35
J122848.21-010414   18.23  2.660  2.263         …
J122924.11-020914   19.27  3.620  2.701         20.65         blending
J123131.88-015350   19.30  3.900  3.670         20.30
J125131.73+661627   19.34  3.020  2.777         20.45         blending
J125659.79-033813   19.08  2.970  2.434         …
J125759.22-011130   18.87  4.110  4.022         20.35
J130643.07-013552   18.82  2.940  2.773         20.60
J133000.94+651948   18.89  3.270  2.951         20.80         blending, poor SNR
J134811.22+641348   19.12  3.840  3.555         21.?0
J135440.16+015827   19.07  3.290  2.562         20.80
J135828.74+005811   19.40  3.910  3.020         20.30         blending
J140200.88+011751   18.81  2.950  2.431         20.30
J140248.07+014634   18.84  4.160  3.277         20.95
J140501.12+041535   19.31  3.220  2.708         20.90         poor SNR
J144752.47+582420   18.37  2.980  2.818         20.65         blending
J145243.61+015430   18.87  3.910  3.253         …
J145329.53+002357   18.58  2.540  2.444         20.40
J150345.94+043421   19.49  3.060  2.618         20.40         blending
J150611.23+001823   18.89  2.830  2.207         20.30         continuum
J163912.86+440813   19.22  3.770  3.642         20.50
J164022.78+411548   19.41  3.080  2.697         20.55
                                  3.017         20.65
J165855.20+375853   19.13  3.640  3.348         20.95         continuum
J171227.74+575506   17.46  3.010  2.253         20.60         blending
J203642.29-055300   18.80  2.580  2.280         21.20         continuum, blending
J205922.42-052842   19.01  2.540  2.210         20.90         continuum, blending, poor SNR
J210025.03-064146   18.12  3.140  3.092         21.05         blending
J215117.00-070753   19.26  2.520  2.327         20.45         continuum
J230623.69-004611   19.23  3.580  3.119         20.65         continuum
J235057.87-005209   18.79  3.020  2.426         20.55         continuum, blending
                                  2.615         21.20         continuum, blending

Note. — The asymmetric uncertainties on log N(HI) are not legible in this copy; as stated in § 3, most correspond to ±0.15 dex. Entries marked … or ? are likewise illegible.
Fig. 4. — Cumulative logarithmic incidence of DLA systems per
unit absorption distance interval dX as a function of log N(HI).
The red curves correspond to the DLA compilation of PMSI03 and
the black curves refer to the combined sample. Note that the high
redshift results have changed significantly by including the
SDSS-DR1 sample.



4.2. Ω_g

We now turn our attention to the cosmological baryonic
mass density in neutral gas Ω_g as determined by DLA surveys.
As first described by Wolfe (1986), one can calculate
Ω_g for a given redshift interval by summing the N(HI) values
of all DLA systems within that interval and comparing
against the total cosmological distance ΔX surveyed:

$$ \Omega_g = \frac{\mu m_H H_0}{c \rho_c} \frac{\sum N({\rm HI})}{\Delta X} \qquad (1) $$

where μ is the mean molecular mass of the gas (taken to be
1.3), m_H is the mass of the hydrogen atom, H_0 is Hubble's
constant, c is the speed of light, and ρ_c is the critical mass
density. We have calculated ΔX and Ω_g for the SDSS-DR1
sample and the PMSI03 compilation for a Ω_m = 0.3,
Ω_Λ = 0.7, H_0 = 70 km s^-1 Mpc^-1 cosmology consistent
with the current 'concordance' cosmology (e.g. Spergel et
al. 2003).
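Equation 1 can be evaluated numerically with a few lines of Python. The sketch below is an illustration under stated assumptions: the physical constants are standard cgs values we supply, the absorption distance uses the usual flat-ΛCDM form dX/dz = (1 + z)² / sqrt(Ω_m (1 + z)³ + Ω_Λ), which the text does not write out, and the summed column density in the example is an illustrative number rather than a value from the paper.

```python
import math

# Constants in cgs units (assumed values for this sketch)
G_CGS = 6.674e-8         # gravitational constant
M_H_G = 1.6735e-24       # mass of the hydrogen atom [g]
C_CM_S = 2.99792458e10   # speed of light [cm/s]
MPC_CM = 3.0857e24       # one megaparsec [cm]


def omega_prefactor(h0_kms_mpc=70.0, mu=1.3):
    """mu m_H H_0 / (c rho_c) in cm^2, the coefficient of sum(N)/dX in Equation 1."""
    h0 = h0_kms_mpc * 1.0e5 / MPC_CM                 # H_0 in 1/s
    rho_c = 3.0 * h0 ** 2 / (8.0 * math.pi * G_CGS)  # critical density [g/cm^3]
    return mu * M_H_G * h0 / (C_CM_S * rho_c)


def absorption_distance(z_lo, z_hi, omega_m=0.3, omega_l=0.7, steps=1000):
    """Absorption distance of one sightline between z_lo and z_hi (midpoint rule)."""
    dz = (z_hi - z_lo) / steps
    total = 0.0
    for i in range(steps):
        z = z_lo + (i + 0.5) * dz
        e_z = math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)
        total += (1.0 + z) ** 2 / e_z * dz
    return total


def omega_gas(column_densities, delta_x, h0_kms_mpc=70.0, mu=1.3):
    """Omega_g from Equation 1 given N(HI) values [cm^-2] and total delta_x."""
    return omega_prefactor(h0_kms_mpc, mu) * sum(column_densities) / delta_x


if __name__ == "__main__":
    print(omega_prefactor())
    print(absorption_distance(2.0, 2.5))
    # Illustrative total column density of 3.3e22 cm^-2 over dX = 880.8
    print(omega_gas([3.3e22], 880.8))
```

For the adopted cosmology the coefficient is about 1.8 × 10^-23 cm², so an Ω_g near 10^-3 over ΔX of several hundred corresponds to a summed N(HI) of a few times 10^22 cm^-2.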

Implicit to Equation 1 is the presumption that the DLA
systems dominate Ω_g at all redshifts. A principal result of
PMSI03 was that at z > 3.5 there are fewer DLA systems
with N(HI) > 10^21 cm^-2 and, therefore, that absorption
systems with N(HI) < 10^20 cm^-2 (the so-called sub-DLA)
will contribute ≈ 50% of Ω_g. This point is partially
described by Figure 4, which presents the cumulative
cosmological number density of DLA systems as a function of HI
column density. The red curves correspond to the compilation
analyzed by PMSI03; as emphasized by these authors,
there is a significant drop in the fraction of DLA systems
with large N(HI) at z > 3.5 in their compilation. The
authors then argued that the sub-DLA make an important
contribution to Ω_g at high redshift. The black lines
in Figure 4 correspond to the combined sample. There
is only a modest difference between the PMSI03 and combined
samples for the z = [2.4, 3.5) interval, but at z > 3.5
(dotted lines) the SDSS-DR1 results have greatly changed
the picture^2. Although the SDSS-DR1 systems contribute
only 6 new DLA systems at z > 3.5, half of these have
N(HI) > 10^21 cm^-2. The resulting cumulative number
density at z > 3.5 is now in rough agreement with the
lower redshift interval (and the predictions of Nagamine
et al. 2004a). Of course, we suspect the SDSS-DR1 sample
shows an abnormally high fraction of DLA systems at
z > 3.5 with N(HI) > 10^21 cm^-2. Similarly, we suspect
the PMSI03 compilation had disproportionately few
systems with large N(HI). This speculation can only be
tested through a significantly larger sample.

We can perform an analysis similar to PMSI03 to estimate
the contribution of LLS with N(HI) = 10^17.2 to
10^20.3 cm^-2 for the combined sample. Adopting their
power law fit to the incidence of LLS, n(z)_LLS = 0.07(1 +
z)^[…], one predicts 246.1 LLS with z > 3.5 for the combined
sample where 36 DLA systems are observed. Assuming
the LLS column density distribution follows a power law
f(N)_LLS = f_0 N(HI)^γ, we derive γ = -1.31 and
f_0 = 10^[…]. We estimate the contribution of LLS to Ω_g
by integrating

$$ \Omega_g^{\rm LLS} = \frac{\mu m_H H_0}{c \rho_c} \int_{10^{17.2}}^{10^{20.3}} N \, f(N)_{\rm LLS} \, dN = 0.00015 \; . \qquad (2) $$

This value corresponds to < 15% of Ω_g derived from z >
3.5 DLA systems (see below). The fractional contribution
is 3 times lower (and > 4σ lower) than the results from
PMSI03. It is important to note that this result has large
statistical and systematic uncertainty. This includes the
parameterization of n(z)_LLS, the assumed functional form
of f(N)_LLS, and the statistical uncertainties in all quantities
including Ω_g. Nevertheless, we conclude that there is
no longer compelling evidence that Lyman limit systems
with N(HI) < 2 × 10^20 cm^-2 contribute significantly to
Ω_g at any redshift. Given the current uncertainties, however,
the exact contribution of the LLS and DLA systems
to Ω_g will await future studies.
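For a single power law the integral over the LLS column density range has a closed form, which makes clear why the estimate is driven by the upper limit of integration. The lines below are a worked sketch in our own notation (N_min and N_max stand for the integration limits) and assume only the power-law form quoted above with γ ≠ −2.

```latex
\begin{align}
\Omega_g^{\rm LLS}
  &= \frac{\mu m_H H_0}{c \rho_c} \int_{N_{\min}}^{N_{\max}} N \, f_0 N^{\gamma} \, dN
   = \frac{\mu m_H H_0}{c \rho_c} \, \frac{f_0}{\gamma + 2}
     \left[ N_{\max}^{\gamma + 2} - N_{\min}^{\gamma + 2} \right] , \\
\frac{N_{\min}^{\gamma + 2}}{N_{\max}^{\gamma + 2}}
  &= \left( \frac{N_{\min}}{N_{\max}} \right)^{0.69}
   \approx 10^{-3 \times 0.69} \approx 0.01
   \quad \text{for } \gamma = -1.31 \text{ and limits about three decades apart.}
\end{align}
```

Since γ + 2 = 0.69 is positive, nearly all of the mass comes from systems just below the DLA threshold, so the result is sensitive to the assumed slope and normalization there.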

Restricting our analysis of Ω_g to the DLA systems, we
derive Ω_g for the SDSS-DR1 sample and the combined
datasets (Figure 5, Table 3). The points plotted in Figure 5
are centered at the N(HI)-weighted redshift in each
interval and the horizontal errors correspond to the redshift
bins analyzed. It is difficult to estimate the error
in Ω_g because the uncertainty is dominated by sample
size, especially the column density frequency distribution
at N(HI) > 10^21 cm^-2. In the current analysis, we estimate
1σ uncertainties through a modified bootstrap error
analysis. Specifically, we examine the distribution of
Ω_g values for 1000 trials where we randomly select m ± p
DLA systems for each redshift interval containing m DLA
systems and where p is a normally distributed random
integer with standard deviation √m. The bootstrap technique
provides a meaningful assessment of the uncertainty
related to sample size provided the observed dataset samples
a significant fraction of the intrinsic distribution. At
present, we are not confident that this is the case at any
redshift interval, but particularly at z > 3. The results
for the z > 4 redshift interval are an extreme example of
this concern. The addition of one or two new DLA with
N(HI) > 10^21 cm^-2 would significantly increase Ω_g and
its 1σ uncertainty. Therefore, we caution the reader that
the 1σ errors reported in Table 3 likely underestimate the
true uncertainty.

^2 We also note that more accurate N(HI) measurements from
Prochaska et al. (2003a) indicate that PMSI03 systematically
underestimated several DLA systems with large N(HI) value. These
new results are not included in Figure 4, but are included in the
results presented below.

Fig. 5. — Cosmological baryonic mass density in neutral gas Ω_g
as derived from the damped Lyα systems for the SDSS-DR1 sample
(red points) and the combined sample (black points). The 1σ
vertical error bars were derived from a modified bootstrap analysis
described in the text. Contrary to previous studies, Ω_g is rising or
unchanged to z = 4 and there is only a statistically insignificant
decline at z > 4.
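The modified bootstrap can be sketched as below. This is a minimal illustration, not the authors' code: it assumes resampling with replacement, treats p as a signed Gaussian draw rounded to an integer, works with the summed N(HI) (which is proportional to Ω_g for a fixed ΔX), and uses illustrative log N(HI) values. The function name and the guard against empty resamples are our own additions.

```python
import math
import random
import statistics


def modified_bootstrap(log_n_values, trials=1000, seed=None):
    """Mean and standard deviation of the summed N(HI) over bootstrap trials.

    Each trial draws m + p systems with replacement from the m observed systems,
    where p is a normally distributed integer with standard deviation sqrt(m).
    """
    rng = random.Random(seed)
    columns = [10.0 ** v for v in log_n_values]
    m = len(columns)
    totals = []
    for _ in range(trials):
        p = round(rng.gauss(0.0, math.sqrt(m)))
        size = max(1, m + p)  # guard: never draw an empty sample
        totals.append(sum(rng.choice(columns) for _ in range(size)))
    return statistics.mean(totals), statistics.stdev(totals)


if __name__ == "__main__":
    # Illustrative log N(HI) values for one redshift bin
    sample = [20.3, 20.4, 20.5, 20.6, 20.7, 20.9, 21.0, 21.4]
    mean_total, sigma_total = modified_bootstrap(sample, seed=1)
    print(sigma_total / mean_total)
```

Varying the sample size along with its membership folds the Poisson uncertainty in the number of systems into the error estimate, in addition to the scatter from which systems are drawn.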

The SDSS-DR1 sample shows no evidence for a decline
in Ω_g at high redshift; the results are even suggestive of
an increasing baryonic mass density at z > 3. We caution,
however, that the uncertainties are large. Combining the
SDSS-DR1 sample with the previous studies^3, we reach
a similar conclusion except at z > 4 where the current
results indicate a drop in Ω_g. As noted above, the results
in the highest redshift interval are very uncertain owing
to small sample size. At present, we consider it an open
question as to whether Ω_g declines at high redshift.

^3 We have updated the measurements presented in PMSI03 to match
the ones presented in Prochaska et al. (2003b).

One means of assessing the robustness of the Ω_g values
to sample size is to examine cumulatively the total N(HI)
in the various redshift intervals. This quantity is presented
in Figure 6 as a function of N(HI) for the combined DLA
sample. On the positive side, the total N(HI) for the z < 4
samples all approach 10^22.5 cm^-2, which is ≈ 10× larger
than the highest N(HI) values observed to date. Therefore,
the results in these intervals are reasonably robust
to the inclusion of an 'outlier' with N(HI) ≈ 10^22 cm^-2.
On the other hand, the curves in Figure 6 demonstrate
that DLA systems with N(HI) > 10^21 cm^-2 do contribute
≈ 50% of the total N(HI) in each interval. This point
stresses the sensitivity of Ω_g to sample size; there are
relatively few DLA systems with N(HI) > 10^21 cm^-2 in
each interval. Sample variance will be important in any
given interval for Ω_g until it includes many systems with
N(HI) > 10^21 cm^-2.
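The weight carried by the few highest column density systems is easy to quantify. The sketch below computes the fraction of the summed N(HI) contributed by systems above a threshold; the function name and the input values are illustrative and are not a transcription of any redshift bin in the paper.

```python
def fraction_above(log_n_values, log_threshold=21.0):
    """Fraction of the total N(HI) contributed by systems with log N(HI) > log_threshold."""
    columns = [10.0 ** v for v in log_n_values]
    high = sum(n for n, v in zip(columns, log_n_values) if v > log_threshold)
    return high / sum(columns)


if __name__ == "__main__":
    # Illustrative bin: eight systems, two of them above 10^21 cm^-2
    sample = [20.3, 20.4, 20.5, 20.6, 20.8, 21.0, 21.2, 21.4]
    print(round(fraction_above(sample), 2))
```

In this toy bin, two of eight systems supply more than half of the total N(HI), so adding or removing a single such system shifts the inferred Ω_g substantially.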
Fig. 6. — Cumulative total N(HI) as a function of log N(HI) for
the redshift intervals displayed in Figure 5. These curves provide
a qualitative assessment of the robustness of the Ω_g values to the
addition of new DLA systems, especially 'outliers' with large N(HI).



Table 3
RESULTS

Sample  z        N   n(z)           ΔX^a   Ω_g (×10^-3)
SDSS    2.1-2.5  26  0.211 ± 0.037  505.0  0.47 +0.12/-0.12
        2.5-3.0  26  0.254 ± 0.046  420.9  0.76 +0.20/-0.18
        3.0-4.1  19  0.296 ± 0.065  266.3  1.43 +0.44/-0.45
Total   2.0-2.5  52  0.189 ± 0.026  880.8  0.67
        2.5-3.0  44  0.215 ± 0.032  704.5  1.03
        3.0-3.5  31  0.271 ± 0.049  421.8  1.22
        3.5-4.0  25  0.366 ± 0.073  268.0  1.21
        4.0-5.0  11  0.401 ± 0.121  113.5  0.76

^a Assumes a Ω_m = 0.3, Ω_Λ = 0.7, H_0 = 70 km s^-1 Mpc^-1 cosmology.
Note. — The uncertainties on Ω_g for the Total rows are incomplete in this copy; the only surviving fragments are +0.16 and -0.26.



5. SUMMARY AND CONCLUDING REMARKS 

In this paper, we have introduced an automated approach
for identifying DLA systems in the SDSS quasar
database. We have applied our method to the Data Release
1 quasar sample and have identified a statistical sample of
71 DLA systems including > 50 previously unpublished
cases. Remarkably, the SDSS Data Release 1 exceeds the
statistical significance of the previous two decades of DLA
research at z ≈ 2.5. More importantly, this sample was
drawn from a well defined, homogeneous dataset of quasar
spectroscopy. We present measurements of the number per
unit redshift n(z) of the DLA population and the contribution
of these systems to the cosmological baryonic mass
density in neutral gas Ω_g. Although the SDSS-DR1 sample
does not offer a definitive assessment of either of these
quantities, future SDSS data releases will provide a major
advancement over all previous work.






The SDSS Damped Lyα Survey I



Our measurements of n(z) are consistent with previous
results, suggesting a high completeness level for our DLA
survey of the SDSS-DR1. We find Ω_g increases with redshift
to at least z = 3 and is consistent with increasing
to z = 4 and beyond. This latter claim, however, is subject
to significant uncertainty relating to sample size. Perhaps
the most important result of our analysis is that the
full DLA sample no longer shows significantly fewer DLA
systems with large N(HI) at z > 3.5. This contradicts
the principal result of PMSI03 from their analysis of the
pre-SDSS DLA compilation. Apparently, their maximum
likelihood approach failed to adequately assess uncertainty
related to sample size. With the inclusion of only 6 new
DLA, we no longer find that Lyman limit systems with
N(HI) < 2 × 10^20 cm^-2 are required in an analysis of Ω_g.

Before concluding, we offer several additional criticisms
of the PMSI03 analysis and the role of sub-DLA systems.
First, these authors assumed a three parameter Γ-function
for the column density frequency distribution of absorption
systems with N(HI) > 10^17.2 cm^-2,

$$ f(N) = (f_*/N_*) (N/N_*)^{-\beta} e^{-N/N_*} . $$

Although this function
gives a reasonable fit to the column density frequency
distribution of the DLA systems, it is not physically
motivated^4 and, more importantly, places much greater
emphasis on sub-DLA than other functions (e.g. a broken
power-law). Future assessments must include other functional
forms to examine this systematic uncertainty. Second,
the authors did not fit for the normalization of the
distribution function f_*. The uncertainty in this parameter
could easily contribute an additional > 50% to the
error budget. Third, their treatment did not account for
sample variance; the uncertainties these authors reported
were severe underestimates. Finally (and perhaps most
importantly), a recent analysis of a sub-DLA sample by
Dessauges-Zavadsky et al. (2003) has shown that these
absorption systems have very high ionization fractions (see
also Howk & Wolfe 2004 in preparation). Although these
absorption systems may ultimately make an important
contribution to the total HI mass density of the universe,
they are intrinsically different from the DLA systems. Indeed,
a more appropriate title for this sub-set of Lyman
limit systems is the 'super-LLS'. This gas, in its present
form, cannot contribute to star formation and is unlikely
to be directly associated with galactic disks or the inner
regions of protogalactic 'clumps'. Any interpretation of
results related to the super-LLS must carefully consider
these points (e.g. Maller et al. 2003; Péroux et al. 2003).

We acknowledge the tremendous effort put forth by the 
SDSS team to produce and release the SDSS survey. We 
thank Art Wolfe, Gabe Prochter, John O'Meara, J. Chris 
Howk, and Ben Weiner for helpful comments and sugges- 
tions. JXP and SHF are partially supported by NSF grant 
AST-0307408 and its REU sub-contract. 